Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Allow default_security_group to have no rules #1123

Conversation

britthouser
Copy link

@britthouser britthouser commented Sep 30, 2024

Description

The current implementation of the aws_default_security_group uses dynamic blocks for both ingress and egress rules. If no rules are passed in, then no dynamic blocks are generated, and no pre-existng rules are changed. In order to implement the CIS benchmark of no rules, the aws_default_security_group resource needs to be created passing empty lists as ingress and egress rules.

This commit updates the default ingress/egress rules to be those AWS uses when it initially created the default SG. It then sets the boolen local.empty_default_security_group if both ingress and egress rules passed in are empty lists. Finally it conditionally creates empty aws_default_security_group resource if local.empty_default_security_group is true. If local.empty_default_security_group is false, the original aws_default_security_group resource is utilized.

Motivation and Context

#759
hashicorp/terraform-provider-aws#20697

Breaking Changes

Yes and No. If the user has specified any custom security group rules, then those are honored and won't break. If the user has not specified any custom security group rules, and has not changed the default rules which AWS created when the security group was created, then those are preserved and won't break. But if the user has not specified any custom security group rules AND outside of terraform changed the rules to something different from the ones AWS supplies, then those will get reverted back to the AWS supplied rules.

How Has This Been Tested?

  • I have updated at least one of the examples/* to demonstrate and validate my change(s)
  • I have tested and validated these changes using one or more of the provided examples/* projects
  • I have executed pre-commit run -a on my pull request

@britthouser britthouser changed the title Allow default_security_group to have no rules fix: Allow default_security_group to have no rules Sep 30, 2024
The current implementation of the aws_default_security_group
uses dynamic blocks for both ingress and egress rules. If no
rules are passed in, then no dynamic blocks are generated,
and no pre-existng rules are changed. In order to implement the
CIS benchmark of no rules, the aws_default_security_group
resorce needs to be created passing empty lists as ingress
and egress rules.

This commit updates the default ingress/egress rules to be
those AWS uses when it initially created the default SG.
It then sets the boolen local.empty_default_security_group
if both ingress and egress rules passed in are empty lists.
Finally it conditionally creates empty aws_default_security_group
resource if local.empty_default_security_group is true.  If
local.empty_default_security_group is false, the original
aws_default_security_group resource is utilized.
@britthouser
Copy link
Author

@bryantbiggs - would love to get your thoughts on this implementation whenever you get a minute.

@britthouser
Copy link
Author

@antonbabenko - could I get a review?

@bryantbiggs
Copy link
Member

I don't think this is a valid change that we should accept here. if you are concerned with CIS and security recommended practices, then I would suggest deleting all default VPCs and security groups created by the AWS account creation process and only use VPCs you've defined and created within Terraform. There are a number of bash/python scripts on the internet that will go through and delete these resources for you, and I believe on Control Tower you now have the option to NOT create these resources when creating/vending new accounts

@bryantbiggs bryantbiggs closed this Oct 8, 2024
@britthouser
Copy link
Author

A point of clarification, this isn't about default VPC(s) AWS creates, but the VPC default security group, which is part of EVERY VPC creation, even via terraform. There is no way to delete this security group. This VPC module has code to "adopt" the default security group already. This MR just updates the behavior when the user specifies empty security group rules. Can we take a second look @bryantbiggs ?

@bryantbiggs
Copy link
Member

you're starting with a PR and not an issue that describes the situation and provides a reproduction - I tried using one of our samples https://github.com/terraform-aws-modules/terraform-aws-vpc/blob/master/examples/simple/main.tf and I get a single security group with no inbound or outbound rules - so what more is missing here?

@britthouser
Copy link
Author

Issue #759 describes the problem. I listed it under the motivations in the PR description, but if there a different way I should be linking it, I'm happy to do so.
I think you will see the issue more readily if you create the VPC using the simple example, then modify the security rules outside of terraform, and then run the terraform again. The rules you created outside of terraform won't get deleted.

@bryantbiggs
Copy link
Member

but again - if you are using Terraform, why should this module worry about things that you are creating outside of Terraform? this isn't a module issue - its a user issue

@britthouser
Copy link
Author

Modifying it outside of terraform was just an easy way to see the problem. It doesn't matter how the rules are added to the default security group. The problem manifests when you try to remove all the rules of the default security group. With the current implementation, its not possible to go from some rules to no rules. That's the behavior I'm trying to fix here. I do apologize if I've not been clear up to this point. But hopefully you can see the use case now: The default security group has had rules applied at some point in the past, and now we need to remove them.

@bryantbiggs
Copy link
Member

Modifying it outside of terraform was just an easy way to see the problem.

The problem though is exactly what you are doing - modifying things outside of Terraform. This is a known thing - so either you can stop doing that (strongly recommended practice when using IaC), or try to get the provider to support something that will ensure things created outside of Terraform are reconciled hashicorp/terraform-provider-aws#4426

@britthouser
Copy link
Author

britthouser commented Oct 8, 2024

Here is a demonstration of the problem, completely within terraform, using just this module. I start out by by doing a terratform apply of the examples/simple/main.tf, and pull the default security group ID from the outputs:

Outputs:
...
default_security_group_id = "sg-0cb784f8aefb03846"

Just to validate I read (not change) the default security group, and see that indeed it has no rules, as that is the module default.

% aws ec2 describe-security-group-rules --filters Name=group-id,Values=sg-0cb784f8aefb03846
{
    "SecurityGroupRules": []
}

Now I modify the terraform to specify ingress and egress rules for the security group, and I run terraform apply again.

% git diff
diff --git a/examples/simple/main.tf b/examples/simple/main.tf
index 3249771..05b24f1 100644
--- a/examples/simple/main.tf
+++ b/examples/simple/main.tf
@@ -28,6 +28,24 @@ module "vpc" {
   name = local.name
   cidr = local.vpc_cidr

+  default_security_group_ingress = [
+    {
+      protocol  = -1
+      self      = true
+      from_port = 0
+      to_port   = 0
+    },
+  ]
+  default_security_group_egress = [
+    {
+      from_port        = 0
+      to_port          = 0
+      protocol         = "-1"
+      cidr_blocks      = "0.0.0.0/0"
+      ipv6_cidr_blocks = "::/0"
+    },
+  ]
+
   azs             = local.azs
   private_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 4, k)]

Checking the default security group rules a second time, we can see that terraform apply updates the rules as requested:

% aws ec2 describe-security-group-rules --filters Name=group-id,Values=sg-0cb784f8aefb03846
{
    "SecurityGroupRules": [
        {
            "SecurityGroupRuleId": "sgr-0add868c2065db78a",
            "GroupId": "sg-0cb784f8aefb03846",
            "GroupOwnerId": "622268126582",
            "IsEgress": true,
            "IpProtocol": "-1",
            "FromPort": -1,
            "ToPort": -1,
            "CidrIpv4": "0.0.0.0/0",
            "Tags": []
        },
        {
            "SecurityGroupRuleId": "sgr-07099de587ef5bb97",
            "GroupId": "sg-0cb784f8aefb03846",
            "GroupOwnerId": "622268126582",
            "IsEgress": false,
            "IpProtocol": "-1",
            "FromPort": -1,
            "ToPort": -1,
            "ReferencedGroupInfo": {
                "GroupId": "sg-0cb784f8aefb03846",
                "UserId": "622268126582"
            },
            "Tags": []
        },
        {
            "SecurityGroupRuleId": "sgr-0c58a13a29bdda40b",
            "GroupId": "sg-0cb784f8aefb03846",
            "GroupOwnerId": "622268126582",
            "IsEgress": true,
            "IpProtocol": "-1",
            "FromPort": -1,
            "ToPort": -1,
            "CidrIpv6": "::/0",
            "Tags": []
        }
    ]
}

Next, I revert the terraform back to the main.tf as it exists in the repo, and re-run apply.

% git restore main.tf
% git diff
% aws-vault exec PHR-Plat-Dev -- terraform apply
data.aws_availability_zones.available: Reading...
module.vpc.aws_vpc.this[0]: Refreshing state... [id=vpc-074fadbc37076c315]
data.aws_availability_zones.available: Read complete after 0s [id=us-east-1]
module.vpc.aws_default_security_group.this[0]: Refreshing state... [id=sg-0cb784f8aefb03846]
module.vpc.aws_default_route_table.default[0]: Refreshing state... [id=rtb-0b11e962cb26c1e57]
module.vpc.aws_subnet.private[1]: Refreshing state... [id=subnet-0589aebe64c0c51fe]
module.vpc.aws_subnet.private[0]: Refreshing state... [id=subnet-0c6fec1b8ce91c427]
module.vpc.aws_subnet.private[2]: Refreshing state... [id=subnet-01cae0bd3de566a73]
module.vpc.aws_default_network_acl.this[0]: Refreshing state... [id=acl-01d26e450e2a6ad13]
module.vpc.aws_route_table.private[0]: Refreshing state... [id=rtb-05c0dfd29898ccfd8]
module.vpc.aws_route_table.private[1]: Refreshing state... [id=rtb-0a03c3a27755c9656]
module.vpc.aws_route_table.private[2]: Refreshing state... [id=rtb-073bdfa36dcd2246d]
module.vpc.aws_route_table_association.private[1]: Refreshing state... [id=rtbassoc-03d4dfca82469b8db]
module.vpc.aws_route_table_association.private[0]: Refreshing state... [id=rtbassoc-0eb4835a987209242]
module.vpc.aws_route_table_association.private[2]: Refreshing state... [id=rtbassoc-083216e582b4a1546]

No changes. Your infrastructure matches the configuration.

Terraform reports not changes to the infra, when in fact I have specified a change! I hope you can see how this is a problem, even when only terraform is used to change the security group rules. With the code currently in the repo, you can NEVER go from some rules to no rules in the default security group. The PR I've proposed allows this change to happen. I hope that clarifies?

@bryantbiggs
Copy link
Member

hey - look at that! a proper reproduction with steps and output. imagine if we had that from the start

@bryantbiggs
Copy link
Member

bryantbiggs commented Oct 8, 2024

however, your changes still are not valid at this time - that would be a breaking change to adopt the changes proposed (and I don't know if we would adopt those even if we were making a breaking change. If AWS changes the defaults at some point, now what do we do - if we don't have an opinion on a default, we simply do nothing)

in your example above - you can work around this by simply setting: edit: turns out you can't, the resource doesn't respect it

  default_security_group_ingress = []
  default_security_group_egress =  []

@britthouser
Copy link
Author

Okay - so we have established that there is indeed a problem here. So lets discuss what todo about it. I tried your workaround, and it does not remove the rules. I did a deployment using the examples/sample/main.tf with added some security group rules defined:

% git diff
diff --git a/examples/simple/main.tf b/examples/simple/main.tf
index 3249771..0e8f216 100644
--- a/examples/simple/main.tf
+++ b/examples/simple/main.tf
@@ -27,6 +27,23 @@ module "vpc" {

   name = local.name
   cidr = local.vpc_cidr
+  default_security_group_ingress = [
+    {
+      protocol  = -1
+      self      = true
+      from_port = 0
+      to_port   = 0
+    },
+  ]
+  default_security_group_egress = [
+    {
+      from_port        = 0
+      to_port          = 0
+      protocol         = "-1"
+      cidr_blocks      = "0.0.0.0/0"
+      ipv6_cidr_blocks = "::/0"
+    },
+  ]

   azs             = local.azs
   private_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 4, k)]

As expected, terraform created the rules:

% aws ec2 describe-security-group-rules --filters Name=group-id,Values=aws ec2 describe-security-group-rules --filters Name=group-id,Values=sg-041a65e6b377ac668
{
    "SecurityGroupRules": [
        {
            "SecurityGroupRuleId": "sgr-07c7b6a273f269889",
            "GroupId": "sg-041a65e6b377ac668",
            "GroupOwnerId": "622268126582",
            "IsEgress": true,
            "IpProtocol": "-1",
            "FromPort": -1,
            "ToPort": -1,
            "CidrIpv6": "::/0",
            "Tags": []
        },
        {
            "SecurityGroupRuleId": "sgr-056985782beef6544",
            "GroupId": "sg-041a65e6b377ac668",
            "GroupOwnerId": "622268126582",
            "IsEgress": true,
            "IpProtocol": "-1",
            "FromPort": -1,
            "ToPort": -1,
            "CidrIpv4": "0.0.0.0/0",
            "Tags": []
        },
        {
            "SecurityGroupRuleId": "sgr-016814fa496d77db8",
            "GroupId": "sg-041a65e6b377ac668",
            "GroupOwnerId": "622268126582",
            "IsEgress": false,
            "IpProtocol": "-1",
            "FromPort": -1,
            "ToPort": -1,
            "ReferencedGroupInfo": {
                "GroupId": "sg-041a65e6b377ac668",
                "UserId": "622268126582"
            },
            "Tags": []
        }
    ]
}

I updated the terraform with your proposed change:

% git diff
diff --git a/examples/simple/main.tf b/examples/simple/main.tf
index 3249771..4eb4b71 100644
--- a/examples/simple/main.tf
+++ b/examples/simple/main.tf
@@ -27,6 +27,8 @@ module "vpc" {

   name = local.name
   cidr = local.vpc_cidr
+  default_security_group_ingress = []
+  default_security_group_egress = []

   azs             = local.azs
   private_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 4, k)]

As you can see, the apply still doesn't invoke any change:

 % terraform apply
data.aws_availability_zones.available: Reading...
module.vpc.aws_vpc.this[0]: Refreshing state... [id=vpc-0ac807f067f1c9337]
data.aws_availability_zones.available: Read complete after 0s [id=us-east-1]
module.vpc.aws_default_route_table.default[0]: Refreshing state... [id=rtb-0848cc7d14d7458f3]
module.vpc.aws_default_security_group.this[0]: Refreshing state... [id=sg-041a65e6b377ac668]
module.vpc.aws_subnet.private[0]: Refreshing state... [id=subnet-0e47a131f5ce29aa7]
module.vpc.aws_subnet.private[2]: Refreshing state... [id=subnet-0b354aff46159adb0]
module.vpc.aws_default_network_acl.this[0]: Refreshing state... [id=acl-08879e5f42cf0b46e]
module.vpc.aws_subnet.private[1]: Refreshing state... [id=subnet-0b10533e0f8e225af]
module.vpc.aws_route_table.private[0]: Refreshing state... [id=rtb-0240ea6c004952ffd]
module.vpc.aws_route_table.private[2]: Refreshing state... [id=rtb-010feff8fe8319723]
module.vpc.aws_route_table.private[1]: Refreshing state... [id=rtb-069c655f1dacb897b]
module.vpc.aws_route_table_association.private[0]: Refreshing state... [id=rtbassoc-00db2871d486bf5f8]
module.vpc.aws_route_table_association.private[1]: Refreshing state... [id=rtbassoc-03023201c8f342b9b]
module.vpc.aws_route_table_association.private[2]: Refreshing state... [id=rtbassoc-0618bbd7693c27a8f]

No changes. Your infrastructure matches the configuration.

This is due to the fact that the empty list [] is not passed through to the aws_default_security_group resource due to the use of the dynamic blocks.

@britthouser
Copy link
Author

Please let me know what is not valid about the PR itself. I was on the fence about what to specify as the default values for default_security_group_ingress and default_security_group_egress. If its better to leave those as empty lists, I'm happy to update the PR. If there's a simpler way to solve this, please point me in the right direction.

@bryantbiggs
Copy link
Member

If there's a simpler way to solve this, please point me in the right direction.

as I stated above - go get it changed in the provider. If we follow your example

  1. Deploy the simple VPC example as is
  2. Add and deploy this - this is bare bones resource, nothing to do with the module just to make it very clear:
resource "aws_default_security_group" "this" {
  vpc_id = module.vpc.vpc_id

  ingress {
      protocol  = -1
      self      = true
      from_port = 0
      to_port   = 0
    }

  egress {
      from_port        = 0
      to_port          = 0
      protocol         = "-1"
      cidr_blocks      = ["0.0.0.0/0"]
      ipv6_cidr_blocks = ["::/0"]
    }
}
  1. Remove the ingress/egress rules from said resource and observe:
resource "aws_default_security_group" "this" {
  vpc_id = module.vpc.vpc_id
}

And lastly - your suggested changes are semantically incorrect - ingress and egress are both the wrong type and syntax

image

I am going to lock this issue because I don't have any additional time to continue down this path - its simply incorrect

@terraform-aws-modules terraform-aws-modules locked as resolved and limited conversation to collaborators Oct 8, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants